Dynamics of cortical contrast adaptation predict perception of signals in noise

Neurons throughout the sensory pathway adapt their responses depending on the statistical structure of the sensory environment. Contrast gain control is a form of adaptation in the auditory cortex, but it is unclear whether the dynamics of gain control reflect efficient adaptation, and whether they shape behavioral perception. Here, we trained mice to detect a target presented in background noise shortly after a change in the contrast of the background. The observed changes in cortical gain and behavioral detection followed the dynamics of a normative model of efficient contrast gain control; specifically, target detection and sensitivity improved slowly in low contrast, but degraded rapidly in high contrast. Auditory cortex was required for this task, and cortical responses were not only similarly affected by contrast but predicted variability in behavioral performance. Combined, our results demonstrate that dynamic gain adaptation supports efficient coding in auditory cortex and predicts the perception of sounds in noise.

mapping). Each plotted line indicates the average firing rate/prediction for 100 simulations. h, Average gain time-course of all simulations (solid colored lines) and the average gain estimates (dashed gray lines). i, Simulations with 100 unique stimulus scenes, repeated 5 times each. Left panel plots the average firing rates and model fits. Right panel plots the true gain time-course (solid lines) and the average model gain estimate (dashed lines). The shaded areas indicate the 2.5 and 97.5 percentiles of the gain estimates. j, Simulations with 5 unique stimulus scenes, repeated 100 times each. Formatting as in i. For panels e-j, the GC value colors and line formatting are indicated in the legend on the bottom right.

Figure 6). Relationship between behavior and gain outside of the target period. a, Correlation coefficients between the prediction of a linear-nonlinear model using STRFs estimated from the model without gain control (static-LN) versus a model with gain control (GC-LN). Each dot indicates a neuron. The red solid line indicates unity. The red "x" indicates the median correlation in each contrast. Asterisks indicate the significance of a two-way sign-rank test (p = 1.88e-295). b, Psychometric performance in low contrast, averaged based on a median split of average cortical gain during the adaptation period of the trial. Light dots and lines indicate the session average and psychometric fit to sessions in the bottom 50th percentile of gain, while dark dots and lines indicate the same values for sessions in the top 50th percentile of gain. Error bars indicate ±SEM across sessions (n = 107). Inset: distribution of average gain in each session estimated from the adaptation period. The red dashed line indicates the median of the distribution, and the histogram bars are shaded according to whether they fall above (dark blue) or below (light blue) the median. c, Session-wise relationship between average gain in the adaptation period and psychometric threshold. Each dot indicates the gain and threshold for a single session, and its color indicates the contrast of the adaptation period. The gray line is the best linear fit to the data. The text in the lower right indicates the results of Likelihood Ratio Tests for models including gain as a predictor (in gray) or contrast as a predictor (in red). Full statistical results in Supplementary Table 1. Grey and black "ns" indicate that gain in the adaptation period and contrast, respectively, did not significantly predict psychometric slopes. d, Same as in c, but plotting psychometric slope as a function of gain.

[Figure panel text: likelihood ratio test against model without gain; likelihood ratio test against model without contrast; model formula: slope ~ gain_adapt + (contrast-1|mouse)]

Supplementary Experimental Procedures
Acute electrophysiological recordings with muscimol or saline. Neuronal signals were recorded from n = 2 awake, untrained mice. Prior to the recording session, each mouse was anesthetized and a headpost and ground pin were implanted on the skull (see Surgery in the main text). On the day of the recording, the mouse was briefly anesthetized with 3% isoflurane and a small craniotomy was performed over auditory cortex using a dental drill or scalpel (~1 mm x 1 mm craniotomy centered approximately 1.25 mm anterior to the lambdoid suture along the caudal end of the squamosal suture). A 32-channel silicon probe (NeuroNexus) was then positioned perpendicular to the cortical surface and lowered at a rate of 1-2 μm/s to a final depth of 800-1200 μm. As the probe was lowered, trains of brief noise bursts were repeated, and if stimulus-locked responses to the noise bursts were observed, the probe was determined to be in auditory cortex. The probe was then allowed to settle for up to 30 minutes before starting the recording.
For the muscimol and saline recordings (Supplementary Figure 5), a durotomy was performed over the injection site and baseline neuronal responses to the behavioral stimuli were recorded. Then, 2.5 μL of 0.25 mg/mL muscimol or 0.9% sterile saline solution was topically applied to the surface of auditory cortex and allowed 30 minutes to penetrate the tissue. Responses to the same stimuli were then recorded again. In these recordings, we presented the same targets and DRC background that were presented during behavior. Neuronal signals from n = 2 mice (1 mouse for muscimol application, 1 mouse for saline application) were amplified and digitized using a Cheetah Digital LYNX system (Neuralynx) at a rate of 32 kHz.

Supplementary Results
The range of presented target levels shapes psychometric performance curves.
In the experiments presented here, we used several sets of target levels to assess psychometric performance (for a summary of the target levels used, see Supplementary Table 3). When computing psychometric curves across all of the target conditions for each mouse (n = 25; Supplementary Figure 4b), we found that detection thresholds were lower in low contrast (Mean dB SNR (M) = 8.56, standard deviation (std) = 1.81) compared to high contrast (M = 14.23, std = 2.57; paired t-test: t(23) = -8.19, p = 2.88e-8, Supplementary Figure 4c). From our normative model, we expected psychometric slopes to decrease in high contrast. Instead, when combining sessions with different target levels, we found a significant increase in slope during high contrast (M = 0.056, std = 0.018) when compared to low contrast (M = 0.045, std = 0.0052; paired t-test: t(23) = -3.42, p = 0.0024, Supplementary Figure 4d). We hypothesized that psychometric performance was sensitive to the range of targets presented. To test this, we split the data by the range of target levels used in each session, finding that targets drawn from a narrow range resulted in steeper psychometric slopes than targets drawn from a wide range, regardless of the background contrast (Supplementary Figure 4e-h). Therefore, we concluded that the paradoxical increase in psychometric slope was due to the narrow range of targets selected for many high contrast sessions. To control for these effects and thus isolate the effect of background contrast on psychometric slope, we considered only sessions with identical ranges of targets in low and high contrast and found that slopes did indeed decrease in high contrast (Figure 3).

Muscimol application disrupts cortical encoding of targets.
In n = 2 awake, naïve mice, we first recorded baseline responses to the stimuli used in the psychometric task, then topically applied muscimol or saline, waited 30 minutes, and recorded stimulus responses again. After muscimol application, there was a marked decrease in neuronal responses to targets compared to the baseline recordings (Supplementary Figure 5b, left). Notably, in our saline control, we observed little to no change in neuronal responses after saline application (Supplementary Figure 5b, right). We next compared how contrast, level, and muscimol or saline application changed the responses during the pre- and post-application periods, finding that muscimol significantly reduced the firing rates between pre- and post-application periods, while saline significantly increased firing rates (Supplementary Figure 5c,d, Supplementary Table 1). We speculate that the small increase in firing rate between pre- and post-saline application was due to changes in recording quality or to neuronal drift over the ~1 hour recording session, and note that the effect size of saline application is very small (η² = 0.0046) compared to the effect size of muscimol (η² = 0.38). We then used a three-way ANOVA to compare the effects of muscimol, contrast, and target level on target responses in the saline and muscimol recording sessions. We found a significant main effect of muscimol (F(1) = 322.65, p = 4.88e-67) and level (F(6) = 15.48, p = 1.98e-17), but no main effect of contrast (F(1) = 0.39, p = 0.53), indicating nearly complete suppression of responses to both targets and background in high and low contrast (Supplementary Figure 5e,f). These results confirmed that muscimol effectively disrupts the cortical coding of our behavioral stimuli.

Muscimol application does not prevent licking.
An alternative explanation for the effect of muscimol is a general loss of the ability to lick. To assess this, we monitored the lick probability of the mice throughout the trial duration, and found that muscimol specifically reduced licking responses during the period where targets were presented (rank-sum test: Z = -4.23, p = 2.34e-5; Supplementary Figure 5g, right panel of Supplementary Figure 5h). Mice also tended to lick immediately after the trial onset (Supplementary Figure 5i, green trace), but we found that the lick rates under muscimol and saline conditions did not differ significantly during this period (rank-sum test: Z = 0.23, p = 0.81; Supplementary Figure 5h, left panel). These results suggest that muscimol does not impair the mouse's ability to lick in general, but results in a specific deficit in licking in response to targets.

STRFs are stable across contrasts.
To interpret the gain changes observed during the behavioral task, we first needed to eliminate the possibility that STRF structure changed across the behavioral epochs of the task (i.e., between low and high contrast periods). To assess STRF stability during these two trial periods, we computed STRFs independently in low and high contrast using a GLM. Next, we selected STRFs with well-defined features by computing the SNR of each STRF. SNR was computed as the ratio between the standard deviation of the STRF coefficients in the first 100 ms and the standard deviation of the coefficients in the remainder of the STRF. This metric assumes that a STRF that is purely noise would have the same variability in the early and late phase of the STRF, while a STRF with stronger stimulus selectivity would have greater variability during the early phase, due to a strong response to some stimulus feature. A sample of high-SNR STRFs is plotted in Supplementary Figure 7a.
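The SNR metric described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `strf_snr`, the array layout (frequency by lag, lag 0 first), and the 25 ms bin width are assumptions made for the example.

```python
import numpy as np

def strf_snr(strf, bin_ms=25.0, early_ms=100.0):
    """Ratio of coefficient variability in the early vs. late part of a STRF.

    strf:   array of shape (n_freqs, n_lags), lag 0 first (assumed layout).
    bin_ms: duration of one lag bin in ms (hypothetical value).
    """
    n_early = int(round(early_ms / bin_ms))
    if not 0 < n_early < strf.shape[1]:
        raise ValueError("STRF must extend beyond the early window")
    early = strf[:, :n_early]   # coefficients in the first 100 ms
    late = strf[:, n_early:]    # coefficients in the remainder of the STRF
    return early.std() / late.std()

rng = np.random.default_rng(0)
noise = rng.normal(size=(30, 12))   # a STRF that is pure noise
tuned = noise.copy()
tuned[10:15, 1:3] += 8.0            # strong feature within the first 100 ms

snr_noise = strf_snr(noise)
snr_tuned = strf_snr(tuned)
```

A pure-noise STRF gives a ratio near 1, while a feature confined to the early lags pushes the ratio well above it, which is the logic behind thresholding on SNR.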
Next, we selected only neurons with SNRs greater than 1.75 in both contrasts (approximately 2 standard deviations greater than the mean SNR, n = 55 neurons, Supplementary Figure 7b). Within this subset of well-tuned neurons, we then computed the best frequency (BF) and lag by first averaging the STRF over time and over frequency, respectively, and multiplying each average by the standard deviation of the coefficients in each bin. The BF and lag were defined as the peaks of the average frequency and time components, respectively. Additionally, we computed the range of the STRFs. As described previously 17,19, we found no significant change in BF (sign-rank test: Z = 1.34, p = 0.18, Supplementary Figure 7c) or lag (Z = -1.29, p = 0.20, Supplementary Figure 7d), but did observe a significant increase in STRF range (Z = -5.76, p = 8.19e-9, Supplementary Figure 7e). As noted in previous work, the change in STRF range is consistent with a change in neuronal gain. Taken together, these findings demonstrate that contrast, and thus the different behavioral epochs of the task, did not significantly influence the tuning features of the STRF (BF and lag).

Generalized linear model of contrast gain control dynamics
A primary goal of the current study was to estimate the influence of stimulus contrast on neuronal gain dynamics, for instance, after a switch from one contrast to another. To approach this problem, we first define a model neuron with dynamic gain control.

Forward model
To best approximate the stimuli used in our experiments, we define the stimulus environment of our model as an $F$-dimensional signal that evolves in discrete time steps:

$$s_{t,f} \sim \mathcal{N}(\mu, \sigma_t^2),$$

where $s_{t,f}$ is a stimulus spectrogram that varies as a function of time $t$ and frequency $f$. Each time and frequency bin of $s$ is sampled from a normal distribution defined by an average value $\mu$ and contrast $\sigma_t$ at time $t$.
To approximate the behavior of real neurons, we define a model neuron that has a two-dimensional linear filter (representing the STRF of the neuron):

$$\beta_{h,f} = \mathcal{N}\big((h, f);\, m, \Sigma\big),$$

where the stimulus filter $\beta_{h,f}$ is defined as a two-dimensional Gaussian distribution evaluated at lag $h$ and frequency $f$. The filter location in frequency-history space is defined by its mean $m$ and covariance matrix $\Sigma$. The stimulus drive of the neuron at each time step, $x_t$, is then computed as the convolution of the stimulus matrix and the linear filter:

$$x_t = y_t \beta, \tag{1}$$

where $y_t$ at each time $t$ is a row vector of length $F \cdot H$ (i.e., the unrolled stimulus spectrogram lagged by $H$ lags) and $\beta$ is the filter, unrolled as a column vector of the same length.
The model neuron has a firing rate that depends only on the stimulus drive $x_t$ and the contrast $\sigma_t$ at time $t$. We then assume that the number of spikes $n_t$ emitted by the neuron at each time step follows a Poisson distribution:

$$n_t \sim \mathrm{Poisson}(\lambda_t),$$

where $\lambda_t$ is the firing rate at time $t$, given by

$$\lambda_t = \exp\big[a + b\, g(\sigma_t)\,(x_t - c)\big], \tag{2}$$

where $g$ is a gain control function, and $a$, $b$, and $c$ are parameters of the model. The parameter $a$ represents the baseline response of the neuron, $b$ is a scaling factor of the stimulus drive, and $c$ represents the operating point of the gain. We remove the degeneracy in the definition of $g$ and $b$ (only their product matters) by requiring that $g$ be dimensionless and such that

$$\frac{g(\sigma_H) + g(\sigma_L)}{2} = 1,$$

where $\sigma_H$ and $\sigma_L$ are the high and low contrast values. This constraint forces the neutral value of the gain, $g = 1$, to be the midpoint between the gain in the high and low contrast conditions.
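The construction of the stimulus drive described above (a Gaussian filter applied to the lagged, unrolled spectrogram) can be sketched in NumPy. All names (`gaussian_strf`, `lagged_design`) and all numeric values are hypothetical choices for illustration, and the zero-padding at the start of the stimulus is an assumption.

```python
import numpy as np

def gaussian_strf(n_lags, n_freqs, mean, cov):
    """2-D Gaussian density evaluated on a (lag, frequency) grid."""
    h, f = np.meshgrid(np.arange(n_lags), np.arange(n_freqs), indexing="ij")
    d = np.stack([h, f], axis=-1) - np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    quad = np.einsum("...i,ij,...j->...", d, np.linalg.inv(cov), d)
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm          # shape (n_lags, n_freqs)

def lagged_design(stim, n_lags):
    """Row t holds the unrolled spectrogram at lags 0..n_lags-1 (zero-padded)."""
    T, F = stim.shape
    Y = np.zeros((T, n_lags, F))
    for h in range(n_lags):
        Y[h:, h, :] = stim[:T - h, :]
    return Y.reshape(T, n_lags * F)

rng = np.random.default_rng(1)
T, F, H = 400, 20, 8
mu = 30.0                                              # hypothetical mean level
sigma_t = np.where(np.arange(T) < T // 2, 1.0, 3.0)    # low, then high contrast
stim = rng.normal(mu, sigma_t[:, None], size=(T, F))   # each bin ~ N(mu, sigma_t^2)

beta = gaussian_strf(H, F, mean=(2.0, 10.0), cov=[[1.0, 0.0], [0.0, 4.0]])
Y = lagged_design(stim, H)       # T x (H*F) design matrix
x = Y @ beta.reshape(-1)         # stimulus drive at each time step
```

Unrolling the filter and the lagged stimulus in the same (lag, frequency) order makes the drive a single matrix-vector product, which is convenient when the same design matrix is reused for fitting.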

Optimal gain control
In the spirit of the efficient coding principle, we derived a form for $g(\sigma)$ that will guarantee that, under certain conditions, the dynamic range of the neuron will be approximately conserved under changes in contrast. To do this, we define the dynamic range as

$$\Delta\lambda = \lambda(x = \mu + \sigma) - \lambda(x = \mu - \sigma),$$

which can be rewritten using equation 2 as

$$\Delta\lambda = \exp\big[a + b\, g(\sigma)(\mu + \sigma - c)\big] - \exp\big[a + b\, g(\sigma)(\mu - \sigma - c)\big].$$

If the argument of the exponentials is not too large, we can linearize this expression to obtain

$$\Delta\lambda \approx 2\, b\, g(\sigma)\, \sigma,$$

which is approximately independent of $\sigma$ provided that $g(\sigma) \propto 1/\sigma$. So, for our model, we set

$$g(\sigma) = \frac{\bar{\sigma}}{\sigma},$$

where $\bar{\sigma}$ is the harmonic mean of $\sigma_H$ and $\sigma_L$:

$$\bar{\sigma} = \frac{2\,\sigma_H \sigma_L}{\sigma_H + \sigma_L}.$$

Finally, to validate that our fitting methods are sensitive to real-world neurons, which do not necessarily adjust their gain to account for changes in contrast according to the model just described, we consider an interpolation scheme that smoothly transforms a model with positive gain control to a similar model without gain control, or with "anti" gain control. To do this, we redefine $g$ as follows:

$$g(\sigma) = 1 + \xi\left(\frac{\bar{\sigma}}{\sigma} - 1\right),$$

so that by changing $\xi$ we can control whether gain control is optimal ($\xi = 1$), nonexistent ($\xi = 0$), or "anti" ($\xi = -1$).
Putting everything together, the final expression for the firing rate of the forward model is

$$\lambda_t = \exp\left[a + b\left(1 + \xi\left(\frac{\bar{\sigma}}{\sigma_t} - 1\right)\right)(x_t - c)\right].$$

Generalized linear model
The forward model developed in the previous section provides a simple approximation of the relationship between the stimulus, stimulus contrast, and neuronal responses. We also note that the form of the forward model lends itself to estimation using a Poisson GLM, provided that the predictors are chosen appropriately. As such, we define the inference model as a Poisson GLM with an intercept term and the following predictors:

$$(x_t - \mu), \qquad \frac{\bar{\sigma}}{\sigma_t}, \qquad (x_t - \mu)\,\frac{\bar{\sigma}}{\sigma_t}.$$

In other words, the model is composed of a stimulus predictor $(x_t - \mu)$, a contrast predictor $(\bar{\sigma}/\sigma_t)$, and their interaction. Therefore, the GLM models the data at time $t$ as a Poisson distribution with the following mean:

$$\lambda_t = \exp\left[\beta_0 + \beta_1 (x_t - \mu) + \beta_2 \frac{\bar{\sigma}}{\sigma_t} + \beta_3 (x_t - \mu)\frac{\bar{\sigma}}{\sigma_t}\right], \tag{10}$$

where $\beta_0 \ldots \beta_3$ are the parameters to be inferred, and, as defined previously, $x_t$ is the stimulus drive of the neuron determined by its STRF.
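The argument for a gain proportional to 1/σ can be checked numerically. The sketch below rests on stated assumptions rather than on forms confirmed by the source: a log-linear rate exp[a + b·g(σ)·(x − c)], a dynamic range measured between drives one contrast unit above and below the mean, and a linear interpolation in ξ between optimal, absent, and "anti" gain control. All parameter values are hypothetical.

```python
import numpy as np

def harmonic_mean(sigma_low, sigma_high):
    return 2.0 * sigma_low * sigma_high / (sigma_low + sigma_high)

def gain(sigma, sigma_low, sigma_high, xi=1.0):
    """Assumed interpolation: xi=1 optimal (~1/sigma), xi=0 none, xi=-1 'anti'."""
    ratio = harmonic_mean(sigma_low, sigma_high) / np.asarray(sigma, dtype=float)
    return 1.0 + xi * (ratio - 1.0)

def dynamic_range(sigma, a, b, c, mu, sigma_low, sigma_high, xi=1.0):
    """Rate at drive mu + sigma minus rate at mu - sigma (assumed definition)."""
    g = gain(sigma, sigma_low, sigma_high, xi)
    upper = np.exp(a + b * g * (mu + sigma - c))
    lower = np.exp(a + b * g * (mu - sigma - c))
    return upper - lower

s_lo, s_hi = 1.0, 3.0                              # hypothetical contrasts
pars = dict(a=0.5, b=0.3, c=0.0, mu=0.0, sigma_low=s_lo, sigma_high=s_hi)

# With optimal gain the dynamic range is the same in both contrasts ...
dr_opt = [dynamic_range(s, xi=1.0, **pars) for s in (s_lo, s_hi)]
# ... whereas without gain control it grows with contrast.
dr_none = [dynamic_range(s, xi=0.0, **pars) for s in (s_lo, s_hi)]
# The mean of the two gains is 1 for any xi (the normalization constraint).
mean_gain = [0.5 * (gain(s_lo, s_lo, s_hi, xi) + gain(s_hi, s_lo, s_hi, xi))
             for xi in (-1.0, 0.0, 1.0)]
```

In this example the operating point equals the stimulus mean, so the dynamic range under optimal gain is conserved exactly; when they differ, conservation holds only approximately, in line with the linearization argument.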
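To make concrete why the forward model is compatible with a Poisson GLM on these three predictors, the sketch below builds the design matrix and maps forward-model parameters onto GLM coefficients. It assumes the log-linear rate exp[a + b·g(σ)·(x − c)] with the interpolated gain g = 1 + ξ(σ̄/σ − 1); the function names and parameter values are hypothetical.

```python
import numpy as np

def glm_design(x, sigma_t, mu, sigma_bar):
    """Columns: intercept, stimulus, contrast, stimulus-by-contrast interaction."""
    s = x - mu
    c = sigma_bar / sigma_t
    return np.column_stack([np.ones_like(s), s, c, s * c])

def forward_to_glm(a, b, c, xi, mu):
    """GLM coefficients implied by the assumed forward-model parameters."""
    return np.array([
        a + b * (1.0 - xi) * (mu - c),   # intercept
        b * (1.0 - xi),                  # stimulus
        b * xi * (mu - c),               # contrast
        b * xi,                          # interaction
    ])

rng = np.random.default_rng(2)
T = 1000
mu, sigma_low, sigma_high = 0.0, 1.0, 3.0          # hypothetical values
sigma_bar = 2.0 * sigma_low * sigma_high / (sigma_low + sigma_high)
sigma_t = np.where(np.arange(T) % 200 < 100, sigma_low, sigma_high)
x = rng.normal(mu, sigma_t)                        # stand-in for the stimulus drive
a, b, c, xi = 0.1, 0.4, 0.5, 0.5                   # hypothetical parameters

g = 1.0 + xi * (sigma_bar / sigma_t - 1.0)
log_rate_forward = a + b * g * (x - c)

X = glm_design(x, sigma_t, mu, sigma_bar)
log_rate_glm = X @ forward_to_glm(a, b, c, xi, mu)
spikes = rng.poisson(np.exp(log_rate_glm))         # simulated spike counts
```

Under these assumptions the interaction coefficient equals b·ξ, so the amount of gain control is carried by the interaction term, which motivates reading gain from that coefficient.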

Model fitting
To fit the model, we took a two-step approach. First, we found the best-fit filter (STRF) for the neuron. Then, we fit the full GLM to determine how the linear drive determined by the STRF is modulated by contrast. In the first step, the linear drive is obtained by fitting the model

$$\lambda_t = \exp(y_t \beta),$$

where $y_t$ is a design matrix defined as a function of frequency bins $f$ and history lags $h$, and $\beta$ is the fitted STRF. Stimulus drive $x_t$ is then computed as in equation 1. We then define the full model according to equation 10:

$$\lambda_t = \exp\Big[\beta_0 + \beta_1 (x_t - \mu) + \sum_{j=1}^{J} \beta_{2,j}\,\big(b_j * g(\sigma)\big)_t + (x_t - \mu)\sum_{j=1}^{J} \beta_{3,j}\,\big(b_j * g(\sigma)\big)_t\Big], \tag{12}$$

where $g(\sigma) = \bar{\sigma}/\sigma_t$ and $\{b_j\}_{j=1}^{J}$ is a set of cubic B-spline temporal basis functions. By defining a matrix $C$ as follows

$$C_{t,j} = \big(b_j * g(\sigma)\big)_t,$$

we can rewrite equation 12 in a more compact form:

$$\lambda = \exp\big[\beta_0 + \beta_1 (x - \mu) + C\beta_2 + \big((x - \mu) \circ C\big)\beta_3\big], \tag{14}$$

where $\circ$ denotes element-by-element "broadcasting" multiplication.
To fit asymmetric changes in firing rate after transitions to low or high contrast, we took the simple approach of defining separate sets of contrast predictors for each transition type. This amounted to modifying $C$ by masking transitions to high contrast or transitions to low contrast with zeros, such that the model fit a window of 40 time bins around each contrast transition. To do so, we created a new matrix $\tilde{C}$ by duplicating $C$ columnwise. Then, we define the first $J$ columns as predictors for the transition to low contrast by masking a 1 second period around each transition to high contrast with zeros. This same procedure was repeated for the remaining $J$ columns in $\tilde{C}$, instead masking out the transition to low contrast. Substituting this into equation 14, we obtain

$$\lambda = \exp\big[\beta_0 + \beta_1 (x - \mu) + \tilde{C}\beta_2 + \big((x - \mu) \circ \tilde{C}\big)\beta_3\big].$$

For the sake of clarity, note that in the expression above, $\beta_0$ is a number, $x$ is a column vector of length $T$, $\beta_1$ is a number, $\tilde{C}$ is a $T$-by-$2J$ matrix, and $\beta_2$ and $\beta_3$ are column vectors of length $2J$.
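The duplication-and-masking step can be sketched as below. This is an illustrative reading, not the authors' implementation: the function name, the assumption that the masked window is centered on each transition (20 bins on either side, i.e. 40 bins or 1 s at a hypothetical 25 ms bin width), and the toy predictor matrix are all assumptions.

```python
import numpy as np

def split_by_transition(C, to_low_idx, to_high_idx, half_window=20):
    """Duplicate C column-wise; zero each copy around the other transition type.

    C:           (T, J) matrix of contrast predictors.
    to_low_idx:  time bins at which the contrast switches to low.
    to_high_idx: time bins at which the contrast switches to high.
    """
    T, J = C.shape
    C_tilde = np.hstack([C, C]).astype(float)
    # First J columns model transitions to low contrast, so transitions to
    # high contrast are masked there; the last J columns do the reverse.
    for idx, cols in ((to_high_idx, slice(0, J)), (to_low_idx, slice(J, 2 * J))):
        for t0 in idx:
            lo, hi = max(t0 - half_window, 0), min(t0 + half_window, T)
            C_tilde[lo:hi, cols] = 0.0
    return C_tilde

C = np.ones((200, 3))                      # toy predictor matrix (T=200, J=3)
C_tilde = split_by_transition(C, to_low_idx=[50], to_high_idx=[150])
```

Because each copy is zero around the other transition type, the coefficients attached to the two column blocks are free to describe different time courses after switches to low and to high contrast.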

Defining gain
We have outlined a forward model for simulating neuronal activity according to efficient coding of stimulus contrast, and described an inference model (a Poisson GLM) for estimating the influence of the stimulus, stimulus contrast, and their interaction. In this section, we describe how to use the fitted parameters to quantify the amount of gain control in the neuron.
Conceptually, an increase or decrease in the gain of a system is analogous to more or less sensitivity to small changes in the stimulus, dependent on what is modulating the gain (in our case, the recent history of the contrast). Based on this intuition, we focus on how the response of the neuron (as modeled by a fitted GLM) is expected to change between conditions where the gain is expected to contribute (i.e., in the presence of gain control) and where it is not (i.e., in the absence of gain control, where gain is "neutral").
To do this, we start by considering the gradient of the link function (the log rate) at time $t$ with respect to $y_t$:

$$\eta_t \equiv \nabla_{y_t} \log \lambda_t = (\beta_1 + C_t \beta_3)\,\beta, \tag{16}$$

where $C_t$ is the $t$-th row of $C$. We can immediately read equation 16 as "the STRF of the model is modulated by a factor of $\beta_1 + C_t \beta_3$ at time $t$", and define the gain based on this intuition, but we will take a slightly longer and more formal route to get to the same result. The gradient $\eta_t$ is a vector with the same dimensionality as $\beta$ and $y_t$, and it encapsulates all information about the sensitivity of the link function to small changes in $y_t$ at a given time. Because $y_t$ is not a scalar (it has $F \cdot H$ components), these changes can happen along many dimensions, and the sensitivity can be different in different directions. We can define the gain based on the sensitivity to changes in a specific direction $u$ (assuming for concreteness that $\|u\| = 1$, although this is not necessary for the derivation below). If $\delta y_t = \epsilon \cdot u$, where $\epsilon$ is some scalar, then

$$\delta \log \lambda_t = \epsilon\, \eta_t \cdot u$$

by definition of the gradient. We can then define the gain along direction $u$ as the ratio between the sensitivity of the log rate to changes along $u$ and the sensitivity one would have if the contrast $\sigma_t$ was at some reference value $\sigma_0$ where we define $g = 1$ by construction. If we do so, we obtain

$$w_t = \frac{\eta_t \cdot u}{\eta_0 \cdot u} = \frac{\beta_1 + C_t \beta_3}{\beta_1 + C_0 \beta_3},$$

where $C_0$ is the row of contrast predictors obtained at the reference contrast $\sigma_0$. Note that this definition does not depend on the initial choice of $u$, or even on the specifics of the choice of basis functions used to define $C$. In conclusion, by reasoning about the sensitivity of the response of the fitted GLM, we define a value $w_t$ which captures the relationship between the true gain $g$ and the stimulus contrast $\sigma_t$.
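The claim that the gain defined through directional sensitivity does not depend on the chosen direction can be checked numerically. The sketch assumes a log rate of the form β0 + β1·x + C_t·β2 + x·(C_t·β3) with x the product of the lagged stimulus and the STRF; the function names, the reference row of ones, and all values are hypothetical.

```python
import numpy as np

def log_rate(y, strf, b0, b1, b2, b3, c_row):
    """Log firing rate for one time bin of the assumed GLM."""
    x = y @ strf                                   # scalar stimulus drive
    return b0 + b1 * x + c_row @ b2 + x * (c_row @ b3)

def gain_estimate(b1, b3, C, c_ref):
    """Sensitivity at each time bin relative to the reference contrast."""
    return (b1 + C @ b3) / (b1 + c_ref @ b3)

def directional_sensitivity(y, u, eps=1e-3, **kw):
    """Finite-difference change in log rate per unit step along u."""
    return (log_rate(y + eps * u, **kw) - log_rate(y, **kw)) / eps

rng = np.random.default_rng(3)
n_dim, J = 12, 4
strf = rng.normal(size=n_dim)
b0, b1 = 0.2, 0.5
b2, b3 = rng.normal(size=J), rng.uniform(0.1, 0.3, size=J)
C = rng.uniform(0.5, 1.5, size=(6, J))     # hypothetical contrast predictors
c_ref = np.ones(J)                         # hypothetical reference row (g = 1)
y = rng.normal(size=n_dim)

w = gain_estimate(b1, b3, C, c_ref)
params = dict(strf=strf, b0=b0, b1=b1, b2=b2, b3=b3)
ratios = []
for _ in range(2):                         # two random unit directions
    u = rng.normal(size=n_dim)
    u /= np.linalg.norm(u)
    num = directional_sensitivity(y, u, c_row=C[0], **params)
    den = directional_sensitivity(y, u, c_row=c_ref, **params)
    ratios.append(num / den)
```

Both directions give the same ratio because the projection of the STRF onto the direction cancels between numerator and denominator, leaving only the contrast-dependent factor.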

Simulations
To validate our inference model, we simulated neuronal activity according to the generative model defined in the Forward Model section (Supplementary Figure 2a). We were interested in capturing several dimensions upon which the generative model could vary, namely, the amount of gain control in the simulated neurons, $\xi$, and the dynamics of the gain function $g$.
To parametrically control the evolution of gain over time, we simulated different temporal trajectories of gain control, by modifying $g(\sigma_t)$ as follows:

$$g(\sigma_i, \tau_i)_t = g(\sigma_{i-1}) + \big(g(\sigma_i) - g(\sigma_{i-1})\big) \cdot E(\tau_i, t),$$

where the gain after a switch to contrast $\sigma_i$ transitions from the gain in the previous contrast, $g(\sigma_{i-1})$, to the gain in the current contrast, $g(\sigma_i)$, according to an exponential function $E$ with time constant $\tau_i$. Note that $\tau_i$ could vary between the two contrasts to simulate asymmetric dynamics. For each simulated neuron, we first generated a STRF and linear drive according to equation 1 (Supplementary Figure 2b,d). For different sets of simulated neurons, we parametrically varied the amount of gain control $\xi$ between -1 and 1, and varied the gain time courses to simulate three types of gain adaptation dynamics: 1) slow transitions to low contrast with fast transitions to high contrast, 2) fast, symmetric transitions to each contrast, 3) fast transitions to low contrast and slow transitions to high contrast (Supplementary Figure 2f).
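A gain trajectory of this kind can be sketched as follows. The saturating exponential 1 − exp(−t/τ) is an assumed concrete form of the exponential transition, and the function name, bin width, time constants, and gain values are hypothetical.

```python
import numpy as np

def gain_trajectory(g_prev, g_curr, tau, n_bins, dt=0.025):
    """Gain relaxing from g_prev to g_curr after a contrast switch.

    tau: time constant in seconds; dt: bin width in seconds (hypothetical).
    """
    t = np.arange(n_bins) * dt
    return g_prev + (g_curr - g_prev) * (1.0 - np.exp(-t / tau))

g_low, g_high = 1.5, 0.5   # e.g. optimal gains for contrasts 1 and 3
# Asymmetric dynamics: slow adaptation to low contrast, fast to high contrast.
to_low = gain_trajectory(g_high, g_low, tau=1.0, n_bins=120)
to_high = gain_trajectory(g_low, g_high, tau=0.1, n_bins=120)
```

Using a separate time constant for each transition type is what allows the three simulated regimes (slow-to-low, symmetric, slow-to-high) to be generated from the same function.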
We simulated 100 neurons for each combination of ξ and τ, with other simulation parameters held constant (Supplementary Table 4). Supplementary Figure 2e plots the average firing rates and overlaid model fits for three sets of simulations with optimal gain control (ξ = 1) while varying τ. Importantly, the model flexibly captured the gain dynamics in the three simulated adaptation conditions, with the gain estimate following the true gain trajectory (Supplementary Figure 2f). For additional values of ξ, the model accurately predicted the firing rates (Supplementary Figure 2g) and gain trajectories (Supplementary Figure 2h). We observed that some combinations of ξ and τ elicited large firing rate transients, particularly in the cases where simulated gain slowly adapted after a switch to high contrast (bottom panels in Supplementary Figure 2e, f, g, h). This behavior is expected, as gain remains relatively high for a longer period after the switch, causing large fluctuations in firing rate as the stimulus drive during high contrast is increased. These large firing rate transients seemed to reduce the accuracy of the gain estimates, but we observed that the predicted time courses still captured the overall asymmetries present in the underlying model.
During our behavioral recordings, we used a limited number of background noise scenes (n = 5) to reduce the overall size of the stimulus set. However, it became clear that our model required a larger sample of stimulus space to accurately estimate gain. To demonstrate this, we plotted the simulation results when neurons were exposed to 100 unique background scenes (Supplementary Figure 2i) compared to simulations where neurons were only exposed to 5 unique background scenes, as in our behavioral recordings (Supplementary Figure 2j). We observed that with 100 scenes, the gain estimates were very close to the true gain values, but gain was consistently underestimated in the case of 5 background scenes, even in the case of perfect gain control. Therefore, when analyzing our behavioral recordings, we used a standard linear-nonlinear model to estimate neuronal gain (Figure 5), as we previously found that gain estimates from the GLM were highly correlated with gain estimated from the LN model (Figure 2i).